Search for: All records

Creators/Authors contains: "Bonneau, Richard"

  1. Abstract

    Inferring gene regulatory networks (GRNs) from single-cell data is challenging due to heuristic limitations. Existing methods also lack estimates of uncertainty. Here we present Probabilistic Matrix Factorization for Gene Regulatory Network Inference (PMF-GRN). Using single-cell expression data, PMF-GRN infers latent factors capturing transcription factor activity and regulatory relationships. Using variational inference allows hyperparameter search for principled model selection and direct comparison to other generative models. We extensively test and benchmark our method using real single-cell datasets and synthetic data. We show that PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods, offering well-calibrated uncertainty estimates.

     
  2. Inferring gene regulatory networks (GRNs) from single-cell gene expression datasets is a challenging task. Existing methods are often designed heuristically for specific datasets and lack the flexibility to incorporate additional information or compare against other algorithms. Further, current GRN inference methods do not provide uncertainty estimates with respect to the interactions that they predict, making inferred networks challenging to interpret. To overcome these challenges, we introduce Probabilistic Matrix Factorization for Gene Regulatory Network inference (PMF-GRN). PMF-GRN uses single-cell gene expression data to learn latent factors representing transcription factor activity as well as regulatory relationships between transcription factors and their target genes. This approach incorporates available experimental evidence into prior distributions over latent factors and scales well to single-cell gene expression datasets. By utilizing variational inference, we facilitate hyperparameter search for principled model selection and direct comparison to other generative models. To assess the accuracy of our method, we evaluate PMF-GRN using the model organisms Saccharomyces cerevisiae and Bacillus subtilis, benchmarking against database-derived gold standard interactions. We discover that, on average, PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods. Moreover, our PMF-GRN approach offers well-calibrated uncertainty estimates, as it performs gene regulatory network (GRN) inference in a probabilistic setting. These estimates are valuable for validation purposes, particularly when validated interactions are limited or a gold standard is incomplete. 
  3. The previously reported Q is a thermoresponsive coiled-coil protein capable of higher-order supramolecular assembly into fibers and hydrogels with upper critical solution temperature (UCST) behavior. Here, we introduce a new coiled-coil protein that is redesigned to disfavor lateral growth of its fibers and thus achieve a higher crosslinking density within the formed hydrogel. We also introduce a favorable hydrophobic mutation to the pore of the coiled-coil domain for increased thermostability of the protein. We note that an increase in the storage modulus and crosslinking density of the hydrogel is coupled with a decrease in fiber diameter. We further characterize the structure, nano-assembly, and rheology of our α-helical coiled-coil (Q2) hydrogel relative to our previous single-domain protein, Q, over the time of its gelation, demonstrating the nature of our hydrogel self-assembly system. In this vein, we also characterize the ability of Q2 to encapsulate the small hydrophobic molecule curcumin, and the impact of encapsulation on the mechanical properties of Q2. The design parameters here not only show the importance of electrostatic potential in self-assembly but also provide a step towards predictable design of electrostatic protein interactions.
  4. INTRODUCTION

    Neurons are by far the most diverse of all cell types in animals, to the extent that “cell types” in mammalian brains are still mostly heterogeneous groups, and there is no consensus definition of the term. The Drosophila optic lobes, with approximately 200 well-defined cell types, provide a tractable system with which to address the genetic basis of neuronal type diversity. We previously characterized the distinct developmental gene expression program of each of these types using single-cell RNA sequencing (scRNA-seq), with one-to-one correspondence to the known morphological types.

    RATIONALE

    The identity of fly neurons is determined by temporal and spatial patterning mechanisms in stem cell progenitors, but it remained unclear how these cell fate decisions are implemented and maintained in postmitotic neurons. It was proposed in Caenorhabditis elegans that unique combinations of terminal selector transcription factors (TFs) that are continuously expressed in each neuron control nearly all of its type-specific gene expression. This model implies that it should be possible to engineer predictable and complete switches of identity between different neurons just by modifying these sustained TFs. We aimed to test this prediction in the Drosophila visual system.

    RESULTS

    Here, we used our developmental scRNA-seq atlases to identify the potential terminal selector genes in all optic lobe neurons. We found unique combinations of, on average, 10 differentially expressed and stably maintained (across all stages of development) TFs in each neuron. Through genetic gain- and loss-of-function experiments in postmitotic neurons, we showed that modifications of these selector codes are sufficient to induce predictable switches of identity between various cell types. Combinations of terminal selectors jointly control both developmental (e.g., morphology) and functional (e.g., neurotransmitters and their receptors) features of neurons.

    The closely related Transmedullary 1 (Tm1), Tm2, Tm4, and Tm6 neurons (see the figure) share a similar code of terminal selectors, but can be distinguished from each other by three TFs that are continuously and specifically expressed in one of these cell types: Drgx in Tm1, Pdm3 in Tm2, and SoxN in Tm6. We showed that the removal of each of these selectors in these cell types reprograms them to the default Tm4 fate. We validated these conversions using both morphological features and molecular markers. In addition, we performed scRNA-seq to show that ectopic expression of pdm3 in Tm4 and Tm6 neurons converts them to neurons with transcriptomes that are nearly indistinguishable from that of wild-type Tm2 neurons. We also show that Drgx expression in Tm1 neurons is regulated by Klumpfuss, a TF expressed in stem cells that instructs this fate in progenitors, establishing a link between the regulatory programs that specify neuronal fates and those that implement them. We identified an intronic enhancer in the Drgx locus whose chromatin is specifically accessible in Tm1 neurons and in which Klu motifs are enriched. Genomic deletion of this region knocked down Drgx expression specifically in Tm1 neurons, leaving it intact in the other cell types that normally express it. We further validated this concept by demonstrating that ectopic expression of Vsx (visual system homeobox) genes in Mi15 neurons not only converts them morphologically to Dm2 neurons, but also leads to the loss of their aminergic identity. Our results suggest that selector combinations can be further sculpted by receptor tyrosine kinase signaling after neurogenesis, providing a potential mechanism for postmitotic plasticity of neuronal fates.

    Finally, we combined our transcriptomic datasets with previously generated chromatin accessibility datasets to understand the mechanisms that control brain wiring downstream of terminal selectors. We built predictive computational models of gene regulatory networks using the Inferelator framework. Experimental validations of these networks revealed how selectors interact with ecdysone-responsive TFs to activate a large and specific repertoire of cell surface proteins and other effectors in each neuron at the onset of synapse formation. We showed that these network models can be used to identify downstream effectors that mediate specific cellular decisions during circuit formation. For instance, reduced levels of cut expression in Tm2 neurons, because of its negative regulation by pdm3, control the synaptic layer targeting of their axons. Knockdown of cut in Tm1 neurons is sufficient to redirect their axons to the Tm2 layer in the lobula neuropil without affecting other morphological features.

    CONCLUSION

    Our results support a model in which neuronal type identity is primarily determined by a relatively simple code of continuously expressed terminal selector TFs in each cell type throughout development. Our results provide a unified framework of how specific fates are initiated and maintained in postmitotic neurons and open new avenues to understanding synaptic specificity through gene regulatory networks. The conservation of this regulatory logic in both C. elegans and Drosophila makes it likely that the terminal selector concept will also be useful in understanding and manipulating the neuronal diversity of mammalian brains.

    Figure: Terminal selectors enable predictive cell fate reprogramming. Tm1, Tm2, Tm4, and Tm6 neurons of the Drosophila visual system share a core set of TFs continuously expressed by each cell type (simplified). The default Tm4 fate is overridden by the expression of a single additional terminal selector to generate Tm1 (Drgx), Tm2 (pdm3), or Tm6 (SoxN) fates.
  5. Fluorescent protein biomaterials have important applications such as bioimaging in pharmacological studies. Self-assembly of proteins, especially into fibrils, is known to produce fluorescence in the blue band. We have shown that an engineered protein capable of self-assembly into nanofibers can be driven to aggregate into mesofibers by encapsulation of a small hydrophobic molecule. Conversely, azobenzenes are hydrophobic small molecules that are virtually non-fluorescent in solution due to their highly efficient photoisomerization. However, they become fluorogenic upon confinement in nanoscale assemblies, which reduces the non-radiative photoisomerization. Here, we report the fluorescence of a hybrid protein-small molecule system in which azobenzene is confined in our protein assembly, leading to fiber thickening and increased fluorescence. We show that our engineered protein Q encapsulates AzoCholine, which bears a photoswitchable azobenzene moiety, in its hydrophobic pore to produce fluorescent mesofibers. This study further investigates the photocontrol of protein conformation as well as the fluorescence of an azobenzene-containing biomaterial.
  6. Abstract

    Motivation

    Machine learning models for predicting cell-type-specific transcription factor (TF) binding sites have become increasingly accurate thanks to the increased availability of next-generation sequencing data and more standardized model evaluation criteria. However, knowledge transfer from data-rich to data-limited TFs and cell types remains crucial for improving TF binding prediction models because available binding labels are highly skewed towards a small collection of TFs and cell types. Transfer prediction of TF binding sites can potentially benefit from a multitask learning approach; however, existing methods typically use shallow single-task models to generate low-resolution predictions. Here, we propose NetTIME, a multitask learning framework for predicting cell-type-specific TF binding sites with base-pair resolution.

    Results

    We show that the multitask learning strategy for TF binding prediction is more efficient than the single-task approach due to the increased data availability. NetTIME trains high-dimensional embedding vectors to distinguish TF and cell-type identities. We show that this approach is critical for the success of the multitask learning strategy and allows our model to make accurate transfer predictions within and beyond the training panels of TFs and cell types. We additionally train a linear-chain conditional random field (CRF) to classify binding predictions and show that this CRF eliminates the need for setting a probability threshold and reduces classification noise. We compare our method’s predictive performance with two state-of-the-art methods, Catchitt and Leopard, and show that our method outperforms previous methods under both supervised and transfer learning settings.

    Availability and implementation

    NetTIME is freely available at https://github.com/ryi06/NetTIME and the code is also archived at https://doi.org/10.5281/zenodo.6994897.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     
  7. The ability to engineer a solvent-exposed surface of self-assembling coiled coils allows one to achieve a higher-order hierarchical assembly such as nano- or microfibers. Currently, these materials are being developed for a range of biomedical applications, including drug delivery systems; however, ways to mechanistically optimize the coiled-coil structure for drug binding are yet to be explored. Our laboratory has previously leveraged the functional properties of the naturally occurring cartilage oligomeric matrix protein coiled coil (C), not only for its favorable motif but also for the presence of a hydrophobic pore that allows small molecule binding. This includes the development of Q, a rationally designed pentameric coiled coil derived from C. Here, we present a small library of protein microfibers derived from the parent sequences of C and Q bearing various electrostatic potentials, with the aim of investigating the influence of higher-order assembly and encapsulation of a candidate small molecule, curcumin. The supramolecular fiber size appears to be well controlled by sequence-imbued electrostatic surface potential, and protein stability upon curcumin binding is well correlated with relative structure loss, which can be predicted by in silico docking.
  8. Labeled protein-based biomaterials have become popular for various biomedical applications such as tissue-engineered, therapeutic, and diagnostic scaffolds. Labeling of protein biomaterials, including with ultrasmall superparamagnetic iron oxide (USPIO) nanoparticles, has enabled a wide variety of imaging and therapeutic techniques. These USPIO-based biomaterials are widely studied in magnetic resonance imaging (MRI), thermotherapy, and magnetically-driven drug delivery, which provide a method for direct and non-invasive monitoring of implants or drug delivery agents. Whereas most developments have used polymers or collagen hydrogels, we show here the use of a rationally designed protein as the building block for a meso-scale fiber. While USPIOs have been chemically conjugated to antibodies, glycoproteins, and tissue-engineered scaffolds for targeting or improved biocompatibility and stability, these constructs have predominantly served as diagnostic agents and often involve harsh conditions for USPIO synthesis. Here, we present an engineered protein–iron oxide hybrid material composed of an azide-functionalized coiled-coil protein with small molecule binding capacity conjugated via bioorthogonal azide–alkyne cycloaddition to an alkyne-bearing iron oxide templating peptide, CMms6, for USPIO biomineralization under mild conditions. The coiled-coil protein, dubbed Q, has been previously shown to form nanofibers and, upon small molecule binding, further assembles into mesofibers via encapsulation and aggregation. The resulting hybrid material is capable of doxorubicin encapsulation as well as sensitive T2*-weighted MRI darkening for strong imaging capability that is uniquely derived from a coiled-coil protein.
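
Illustrative note on records 1 and 2: the PMF-GRN abstracts describe learning latent factors for transcription factor activity and for TF-to-gene regulatory relationships from an expression matrix. The sketch below shows only the underlying matrix-factorization view on synthetic data. It is not the PMF-GRN implementation: the sizes, the noise level, the prior mask, and every variable name are assumptions made for illustration, and ordinary least squares stands in for the variational inference the abstracts mention, so it yields point estimates with no uncertainty.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_tfs, n_genes = 200, 3, 10  # hypothetical sizes

# Hypothetical prior evidence: 1 where a TF-gene edge is supported, else 0.
prior_mask = (rng.random((n_tfs, n_genes)) < 0.4).astype(float)

# Latent factors: per-cell TF activity and signed TF-to-gene weights.
tf_activity = rng.gamma(shape=2.0, scale=1.0, size=(n_cells, n_tfs))
weights = rng.normal(size=(n_tfs, n_genes)) * prior_mask

# Expression is modeled as the product of the two factors plus small noise.
noise = 0.01 * rng.normal(size=(n_cells, n_genes))
expression = tf_activity @ weights + noise

# Point estimate of the weights given the activities (least squares).
# A probabilistic method would infer both factors jointly, with uncertainty.
est_weights, *_ = np.linalg.lstsq(tf_activity, expression, rcond=None)

max_error = float(np.abs(est_weights - weights).max())
print(f"largest absolute weight error: {max_error:.4f}")
```

In this toy setting the activities are treated as known, which is what makes a single least-squares solve sufficient; with both factors unknown, the problem needs priors or constraints such as the experimental evidence the abstract refers to.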